Instantaneous Speaker Adaptation Through Selection and Combination of fMLLR Transformation Matrices
نویسندگان
چکیده
This paper addresses instantaneous speaker adaptation, based on feature-space maximum likelihood linear regression (fMLLR), in the context of an automatic transcription task. We investigate the use of fMLLR-based adaptation when the need of a preliminary decoding pass for a speech segment is removed, as sufficient statistics for adaptation parameter estimation are gathered with respect to a Gaussian mixture model. To cope with limited adaptation data, in addition of using feature-space maximum a posteriori linear regression (fMAPLR), an investigation is conducted where the transformation matrix to be applied to the speech segment is estimated through selection and combination of pre-computed fMLLR transformation matrices. For speaker adaptively trained acoustic models results of recognition experiments show that the proposed approach is moderately better than fMLLR but not as good as fMAPLR.
منابع مشابه
Online Speaker Adaptation with Pre-Computed FMLLR Transformations
This paper presents a memory efficient single pass speech recognizer that makes use of pre-computed FMLLR transformations for online speaker adaptation. For that purpose we apply unsupervised segment clustering to the training corpus, create a transformation matrix for each cluster, and train a text-independentGaussian mixture classifier for cluster selection during runtime. We use the RWTH Aac...
متن کاملFeature and model space speaker adaptation with full covariance Gaussians
Full covariance models can give better results for speech recognition than diagonal models, yet they introduce complications for standard speaker adaptation techniques such as MLLR and fMLLR. Here we introduce efficient update methods to train adaptation matrices for the full covariance case. We also experiment with a simplified technique in which we pretend that the full covariance Gaussians a...
متن کاملFeature and model space speaker adaptati
Full covariance models can give better results for speech recognition than diagonal models, yet they introduce complications for standard speaker adaptation techniques such as MLLR and fMLLR. Here we introduce efficient update methods to train adaptation matrices for the full covariance case. We also experiment with a simplified technique in which we pretend that the full covariance Gaussians a...
متن کاملPreliminary Work on Speaker Adaptation for Dnn-based Speech Synthesis
We investigate speaker adaptation in the context of deep neural network (DNN) based speech synthesis. More specifically, our current work focuses on the exploitation of auxiliary information such as gender, speaker identity or age during the DNN training process. The proposed technique is compared to standard acoustic feature transformations such as the feature based maximum likelihood linear r...
متن کاملA fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation
We describe a novel maximum likelihood nonlinear feature bias compensation method for Gaussian mixture model–hidden Markov model (GMM–HMM) adaptation. Our approach exploits a single-hiddenlayer neural network (SHLNN) that, similar to the extreme learning machine (ELM), uses randomly generated lower-layer weights and linear output units. Different from the conventional ELM, however, our approach...
متن کامل